Method and System for Sharpness Enhancement for Coded Video 

BACKGROUND OF THE INVENTION 



1. Cross-Reference to Related Applications

This invention uses the UME of co-pending application, 
Apparatus and Method for Providing a Usefulness Metric 
based on Coding Information for Video Enhancement, 
inventors Lilla Boroczky and Johan Janssen, filed 
concurrently herewith. The present invention is entitled 
to the benefit of Provisional Patent Application Serial 
Number 60/260,845 filed January 10, 2001. 

2. Field of the Invention

The present invention is directed to a system and 
method for enhancing the sharpness of encoded/transcoded 
digital video, without enhancing encoding artifacts, which 
has particular utility in connection with spatial domain 
sharpness enhancement algorithms used in multimedia 
devices.

3. Description of the Related Art 

The development of high-quality multi-media devices, 
such as set-top boxes, high-end TV's, Digital TV's, 
Personal TV's, storage products, PDA's, wireless internet
devices, etc., is leading to a variety of architectures and 
to more openness towards new features for these devices. 
Moreover, the development of these new products and their
ability to display video data in any format has resulted
in new requirements and opportunities with respect to video
processing and video enhancement algorithms. Most of these 
devices receive and/or store video in the MPEG-2 format and 
in the future they may receive/store in the MPEG-4 format. 
The picture quality of these MPEG sources can vary between 
very good and extremely bad. 

Next generation storage devices, such as the blue- 
laser-based Digital Video Recorder (DVR) will have to some 
extent HD (ATSC) capability and are an example of the type 
of device for which a new method of picture enhancement 
would be advantageous. An HD program is typically broadcast 
at 20 Mb/s and encoded according to the MPEG-2 video 
standard. Taking into account the approximately 25 GB 
storage capacity of the DVR, this represents about a two- 
hour recording time of HD video per disc. To increase the 
record time, several long-play modes can be defined, such 
as Long-Play (LP) and Extended-Long-Play (ELP) modes. 

For LP-mode the average storage bitrate is assumed to 
be approximately 10 Mb/s, which allows double record time 
for HD. As a consequence, transcoding is an integral part
of the video processing chain, which reduces the broadcast
bitrate of 20 Mb/s to the storage bitrate of 10 Mb/s.
During the MPEG-2 transcoding, the picture quality (e.g.,
sharpness) of the video is most likely reduced. However,
especially for the LP mode, the picture quality should not 
be compromised too much. Therefore, for the LP mode, post- 
processing plays an important role in improving the 
perceived picture quality. 

To date, most of the state-of-the-art sharpness 
enhancement algorithms were developed and optimized for 
analog video transmission standards like NTSC, PAL and 
SECAM. Traditionally, image enhancement algorithms either 
reduce certain unwanted aspects in a picture (e.g., noise 
reduction) or improve certain desired characteristics of an 
image (e.g., sharpness enhancement). For these emerging 
storage devices, the traditional sharpness enhancement 
algorithms may perform sub-optimally on MPEG encoded or 
transcoded video due to the different characteristics of 
these sources. In the closed video processing chain of the 
storage system, information which allows for determining 
the quality of the encoded source can be derived from the 
MPEG stream. This information can potentially be used to 
increase the performance of image enhancement algorithms. 

Because image quality will remain a distinguishing
factor for high-end video products, new approaches for 
performing image enhancement, specifically adapted for use 
with these sources, will be beneficial. In C-J Tsai, P. 
Karunaratne, N. P. Galatsanos and A. K. Katsaggelos, "A 
Compressed Video Enhancement Algorithm", Proc. of IEEE, 
ICIP'99, Kobe, Japan, Oct. 25-28, 1999, the authors 
propose an iterative algorithm for enhancing video 
sequences that are encoded at low bit rates. For MPEG 
sources, the degradation of the picture quality originates 
mostly from the quantization function. Thus, the iterative 
gradient-projection algorithm employed by the authors uses 
coding information such as quantization step size, 
macroblock types and forward motion vectors in its cost 
function. The algorithm shows promising results for low bit
rate video; however, its main disadvantage is its high
computational complexity.

In B. Martins and S. Forchammer, "Improved Decoding of
MPEG-2 Coded Video", Proc. of IBC'2000, Amsterdam, The
Netherlands, Sept. 7-12, 2000, pp. 109-115, the authors 
describe a new concept for improving the decoding of MPEG-2 
coded video. Specifically, a unified approach for 
deinterlacing and format conversion, integrated in the 
decoding process, is proposed. The technique results in 
considerably higher picture quality than that obtained by
ordinary decoding. However, to date, its computational
complexity prevents its implementation in consumer
applications.

Both papers describe video enhancement algorithms
using MPEG coding information and a cost function. However,
both of these scenarios, in addition to being impractical,
combine the enhancement and the cost function. A cost
function determines how much, and at which locations in a
picture, enhancement can be applied. The problem which
results from this combination of cost and enhancement
functions is that only one algorithm can be used with the
cost function.

OBJECT AND SUMMARY OF THE INVENTION 

The present invention addresses the foregoing needs by
providing a system (i.e., a method, an apparatus, and
computer-executable process steps) in which a usefulness
metric calculates how much a pixel can be enhanced without
increasing coding artifacts.

It is an object of this invention to provide a 
system in which the usefulness metric is separate from the 
enhancement algorithm such that a variety of different 
enhancement algorithms can be used in conjunction with the
metric.

It is a further object of the invention to 
provide a usefulness metric which can be tuned towards the 
constraints of the system such that an optimal trade-off 
between performance and complexity is assured. 

It is a further object of the invention to 
provide a system of image enhancement which will perform 
optimally with encoded and transcoded video sources. 

This brief summary has been provided so that the 
nature of the invention may be understood quickly. A more 
complete understanding of the invention can be obtained by 
reference to the following detailed description of the 
preferred embodiments thereof in connection with the 
attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 

For a better understanding of the invention, reference
is made to the following drawings:

Figure 1 is a block diagram of the invention; and

Figure 2 is a flowchart of the invention using
only the coding gain.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS



Figure 1 shows a system in which the present 
invention can be implemented, for example, in a video 
receiver 56. Figure 1 illustrates how a usefulness metric
(UME) can be applied to a sharpness enhancement algorithm,
adaptive peaking, for example. (Other sharpness enhancement
algorithms, besides adaptive peaking, can also be used.)
The adaptive peaking algorithm, directed at increasing the
amplitude of the transient of a luminance signal 2, does
not always provide optimal video quality for an a priori
encoded/transcoded video source. This is mainly a result of
the fact that the characteristics of the MPEG source are 
not taken into account. In the present invention, a UME is 
generated, which does take into account the characteristics 
of the MPEG source. The example algorithm, adaptive 
peaking, is extended to use this UME, thereby increasing 
the performance of the algorithm significantly. 

The adaptive peaking algorithm and the principle of
adaptive peaking are well known in the prior art. An
example is shown in Fig. 1. The algorithm includes four
control blocks 6, 8, 10, 12. These pixel-based control blocks
6, 8, 10, 12 operate in parallel and each calculates a maximum
allowable gain factor g1, g2, g3, g4, respectively, to achieve
a target image quality. These control blocks 6, 8, 10, 12 take
into account particular local characteristics of the video
signal such as contrast, dynamic range, and noise level,
but not coding properties. The coding gain block 14 uses
the usefulness metric (UME) 18 to determine the allowable
amount of peaking g_{coding} 36. A dynamic gain control 16
selects the minimum of the gains g1 28, g2 30, g3 32, g4
34, which is added to g_{coding} 36, generating a final gain g
38. The multiplier 22 multiplies the final gain 38 by the
high-pass signal 20, which has been filtered by the 2D
peaking filter 4. The adder 24 adds this product to the
original luminance value of a pixel 2. In this manner, the
enhanced luminance signal 26 is generated.
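
The gain combination described for Fig. 1 can be sketched for a single pixel as follows. This is a minimal illustration rather than the implementation of the invention: the function name, the argument layout, and the final clamp to an 8-bit range are assumptions that do not appear in the text.

```python
def enhance_pixel(luma, highpass, control_gains, g_coding):
    """Peak one luminance sample following the signal flow of Fig. 1.

    luma          -- original luminance value of the pixel (signal 2)
    highpass      -- output of the 2D peaking filter for this pixel (signal 20)
    control_gains -- the gains g1..g4 from the four control blocks
    g_coding      -- the coding gain derived from the UME
    """
    # Dynamic gain control 16: the minimum of g1..g4 is added to g_coding.
    gain = min(control_gains) + g_coding
    # Multiplier 22 and adder 24.
    enhanced = luma + gain * highpass
    # Assumption: clamp to the 8-bit range; the text does not discuss clipping.
    return max(0.0, min(255.0, enhanced))


print(enhance_pixel(120.0, 10.0, [0.5, 0.8, 0.6, 0.9], 0.25))
```

Because the coding gain enters as a separate term, a different enhancement algorithm could replace the peaking step without changing how the coding gain is obtained.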

The UME 18 calculates, on a pixel-by-pixel basis, how
much a pixel or region can be enhanced without increasing
coding artifacts. The UME 18 is derived from the MPEG 
coding information present in the bitstream. 

Choosing the MPEG information to be used with the UME 
18 is far from trivial. The information must provide an 
indication of the spatio-temporal characteristics or 
picture quality of the video. 

The finest granularity of MPEG information which can
be directly obtained during decoding is either block-based
or macroblock-based. However, for spatial (pixel) domain
video enhancement, the UME 18 must be calculated for each
pixel of a picture in order to ensure the highest picture
quality.

One parameter easily extracted from MPEG information
is the quantization parameter, as it is present for every
coded macroblock (MB). The higher the quantization
parameter, the coarser the quantization, and therefore, the
higher the quantization error. A high quantization error
results in coding artifacts and, consequently, enhancement
of pixels in a MB with a high quantization parameter must
be suppressed more.

Another parameter that can easily be extracted from 
the MPEG stream is the number of bits spent in coding a MB 
or block. The value of the aforementioned coding 
information is dependent upon other factors including: 
scene content, bitrate, picture type, and motion 
estimation/compensation.

Both the quantization parameter and the number of bits 
spent are widely used in rate control calculations of MPEG 
encoding and are commonly used to calculate the coding 
complexity. Coding complexity is defined as the product of 
the quantization parameter and the number of bits spent to 
encode a MB or block. Coding complexity is therefore 
described by the following equation: 

    compl_{MB/block}(k,l) = mquant(k,l) \cdot bits_{MB/block}(k,l)

where mquant is the quantization parameter and bits_{MB/block} is
the number of bits of DCT coefficients used to encode the MB
or block (k,l). The underlying assumption is that the higher
the complexity of a MB or block with respect to the average 
complexity of a frame, the higher the probability of having 
coding artifacts in that MB or block. Thus, enhancement 
should be suppressed for the pixels of the blocks with 
relatively high coding complexity. 
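
The complexity definition above can be illustrated with a short sketch. It assumes, for illustration only, that the decoder has already collected the quantization parameter and the DCT bit count of every MB into two equally sized nested lists; the function names are hypothetical.

```python
def complexity_map(mquant, bits):
    """Return compl(k,l) = mquant(k,l) * bits(k,l) for every MB or block."""
    return [
        [q * b for q, b in zip(q_row, b_row)]
        for q_row, b_row in zip(mquant, bits)
    ]


def mean_complexity(compl):
    """Average coding complexity over all MBs or blocks of a picture."""
    values = [v for row in compl for v in row]
    return sum(values) / len(values)


mquant = [[4, 8], [16, 8]]         # quantization parameter per MB
bits = [[100, 50], [25, 200]]      # DCT coefficient bits per MB
compl = complexity_map(mquant, bits)
print(compl, mean_complexity(compl))
```

In this example the lower-right MB has a complexity well above the picture average, so it is the one where enhancement would be suppressed most.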

Accordingly, the UME 18 of pixel (i,j) can be defined 
by the following equation: 



    UME(i,j) = 1 - \frac{compl_{pixel}(i,j)}{2 \cdot \overline{compl}}

where compl_{pixel}(i,j) is the coding complexity of pixel (i,j)
and \overline{compl} is the average coding complexity of a picture. In
the present invention, compl_{pixel}(i,j) is estimated from the
MB or block complexity map 48 (Figure 2) by means of bilinear
interpolation 58 (Figure 2).

In one aspect of the invention, UME(i,j) can range 
from 0 to 1. In this aspect, zero means that no sharpness 
enhancement is allowed for a particular pixel, while 1 
means that the pixel can be freely enhanced without the
risk of enhancing any coding artifacts. 
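
A per-pixel UME along these lines might be computed as sketched below. Several details are assumptions made for illustration and are not stated in the text: block values are treated as located at block centres, borders are handled by clamping, the picture average is taken over the block map, an all-zero map yields a UME of 1, and the result is clipped so that it stays within the range 0 to 1.

```python
import math


def bilinear_sample(block_map, i, j, block_size):
    """Interpolate a block-level map at pixel (i, j), i = row, j = column."""
    rows, cols = len(block_map), len(block_map[0])
    # Pixel position in block-centre coordinates, clamped at the borders.
    fy = min(max((i + 0.5) / block_size - 0.5, 0.0), rows - 1.0)
    fx = min(max((j + 0.5) / block_size - 0.5, 0.0), cols - 1.0)
    r0, c0 = int(math.floor(fy)), int(math.floor(fx))
    r1, c1 = min(r0 + 1, rows - 1), min(c0 + 1, cols - 1)
    wy, wx = fy - r0, fx - c0
    top = (1 - wx) * block_map[r0][c0] + wx * block_map[r0][c1]
    bottom = (1 - wx) * block_map[r1][c0] + wx * block_map[r1][c1]
    return (1 - wy) * top + wy * bottom


def ume(block_map, i, j, block_size=16):
    """UME(i,j) = 1 - compl_pixel(i,j) / (2 * mean complexity), clipped to [0, 1]."""
    values = [v for row in block_map for v in row]
    mean = sum(values) / len(values)
    if mean == 0:
        return 1.0  # assumption: no coded data, so nothing to protect
    compl_pixel = bilinear_sample(block_map, i, j, block_size)
    return min(1.0, max(0.0, 1.0 - compl_pixel / (2.0 * mean)))


print(ume([[400, 400], [400, 1600]], 0, 0))
```

With this formula a pixel whose interpolated complexity equals the picture average receives a UME of 0.5, and a pixel at twice the average or more receives 0.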

The UME equation can be extended, by the addition of a 
term directly related to the quantization parameter, to 
incorporate a stronger bitrate dependency. This can be 
especially advantageous for video that has been encoded at 
a low bitrate. 

For skipped or uncoded MBs/blocks, the UME is
estimated 50 (Figure 2) from surrounding values.
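
The text does not specify how the surrounding values are combined. As one possible illustration, the sketch below fills each skipped entry of a block-level map with the mean of its coded neighbours in a 3x3 window. The use of None as a marker for skipped blocks, the window size, and the function name are all assumptions.

```python
def fill_skipped(block_values):
    """Replace None entries by the mean of the coded 8-connected neighbours."""
    rows, cols = len(block_values), len(block_values[0])
    filled = [row[:] for row in block_values]
    for r in range(rows):
        for c in range(cols):
            if block_values[r][c] is not None:
                continue
            neighbours = [
                block_values[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c) and block_values[rr][cc] is not None
            ]
            # An entry with no coded neighbour is left as None.
            if neighbours:
                filled[r][c] = sum(neighbours) / len(neighbours)
    return filled


print(fill_skipped([[1.0, None], [0.5, 0.0]]))
```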

Because the UME 18 is calculated to account for 
coding characteristics, it only prevents the enhancement of 
coding artifacts such as blocking and ringing. Thus, the 
prevention or reduction of artifacts of non-coding origin, 
which might result from applying too much enhancement, is 
addressed by other parts of the sharpness enhancement 
algorithm. 

The aforementioned UME 18 can be combined with any 
peaking algorithm, or it can be adapted to any spatial 
domain sharpness enhancement algorithm. It is also possible 
to utilize coding information 46 (Figure 2) and incorporate
scene content related information 44 (Figure 2), in
combination with an adaptive peaking algorithm. 

In this embodiment, shown in Figure 2, the four
control blocks 6, 8, 10, 12 shown in Figure 1 are eliminated.
Scene content information, such as edge information 44, is
incorporated into the coding gain calculation via the edge
detection 42. The scene-content related information 44
compensates for the uncertainty of the UME calculation 18
(Fig. 1), the uncertainty resulting from assumptions made and
interpolations applied in its calculation 58, 36 (Fig. 2).

In this embodiment, the coding gain of a pixel (i,j)
36 is determined by summing the UME, which is embedded in
the coding gain calculation 36, with an Edge Map 44
related term according to the equation below:

    g_{coding}(i,j) = UME(i,j) + g_{edge}(i,j)

UME is defined above and g_{edge} is based on edge-related pixel
information.
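
The sum above can be written as a short sketch. The text only states that the edge term is based on edge-related pixel information, so the form used here (a fixed constant for edge pixels and zero elsewhere) and the value of that constant are purely hypothetical placeholders.

```python
EDGE_TERM = 0.2  # hypothetical value; neither sign nor magnitude is given in the text


def coding_gain(ume_value, is_edge, edge_term=EDGE_TERM):
    """g_coding(i,j) = UME(i,j) + g_edge(i,j) for one pixel."""
    # Assumption: g_edge is a constant on edge pixels and zero elsewhere.
    g_edge = edge_term if is_edge else 0.0
    return ume_value + g_edge


print(coding_gain(0.5, False), coding_gain(0.5, True))
```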

It should be noted that the complexity map 56 of the
MB/block has an inherent block structure. To decrease this
undesirable characteristic of the complexity map 56,
spatial low-pass filtering 52 is applied. An
example filter kernel, which can be used for low-pass
filtering, is:

    LP_{compl_map} = (filter kernel matrix not recoverable from this text)
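
The spatial smoothing of the complexity map can be sketched as follows. The kernel values used here are not taken from this text: a common 3x3 binomial kernel is assumed purely for illustration, as is the replication of border values.

```python
# Assumed kernel (weights sum to 16); not the kernel of the original document.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def lowpass(compl_map, kernel=KERNEL):
    """Apply a normalized 3x3 low-pass kernel to a block-level complexity map."""
    rows, cols = len(compl_map), len(compl_map[0])
    norm = sum(sum(row) for row in kernel)
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    # Replicate border values outside the map.
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    acc += kernel[dr + 1][dc + 1] * compl_map[rr][cc]
            out[r][c] = acc / norm
    return out


print(lowpass([[0, 0, 0], [0, 16, 0], [0, 0, 0]]))
```

A normalized kernel leaves a uniform map unchanged, so the smoothing reduces block-to-block jumps without shifting the overall complexity level.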



Another problem is that abrupt frame-to-frame changes
in the coding gain for any given pixel can result in 
temporally inconsistent sharpness enhancement, which is 
undesirable. Such changes can also intensify temporally 
visible and annoying artifacts such as mosquito noise. 

To remedy this effect, temporal filtering 54 is 
applied to the coding gain using the gain of the previous 
frame. To reduce the high computational complexity and 
memory requirement, instead of filtering the gain-map, the 
MB or block-based complexity map 48 is filtered temporally 
using an IIR filter 54. The following equation represents 
this processing: 

    compl_{MB/block}(r,s,t) = k \cdot compl_{MB/block}(r,s,t)
        + scal \cdot (1-k) \cdot compl_{MB/block}(r,s,t-1)

where r,s is the spatial coordinate of a MB or block, t
represents the current picture, k is the IIR filter
coefficient and scal is a scaling term taking into account
the complexity differences among different picture types.
The coding gain 36 is then applied to the adaptive peaking
algorithm using the frame t 60 to produce an enhanced frame
t 60.
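
The temporal recursion on the block-level complexity map can be sketched as below. The default values of the filter coefficient and of the scaling term are illustrative assumptions, since the text gives no numerical values for either.

```python
def temporal_filter(current, previous, k=0.5, scal=1.0):
    """First-order IIR filtering of the MB/block complexity map.

    current  -- complexity map of picture t
    previous -- filtered complexity map of picture t-1
    k        -- IIR filter coefficient (assumed default)
    scal     -- scaling between picture types (assumed default)
    """
    return [
        [k * cur + scal * (1.0 - k) * prev for cur, prev in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


print(temporal_filter([[100.0]], [[200.0]], k=0.25))
```

Filtering the block-level map instead of the pixel-level gain map keeps the stored history small: one value per MB or block rather than one per pixel.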

The invention can also be applied to HD and SD 
sequences such as would be present in a video storage 
application having HD capabilities and allowing long-play 
mode. The majority of such video sequences are transcoded 
to a lower storage bitrate from broadcast MPEG-2 
bitstreams. For the long play mode of this application, 
format change can also take place during transcoding. Well- 
known SD video sequences encoded, decoded, and then 
processed with the sharpness enhancement algorithm, 
according to the present invention, provide superior video 
quality for a priori encoded or transcoded video sequences 
as compared to algorithms that do not use coding 
information.

The present invention has been described with respect 
to particular illustrative embodiments. It is to be 
understood that the invention is not limited to the above- 
described embodiments and modifications thereto, and that 
various changes and modifications may be made by those of 
ordinary skill in the art without departing from the spirit 
and scope of the appended claims. 